AITopics | language representation model

2506.18602

Country: Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Banking & Finance > Trading (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Dashti, Seyed Mohammad Sadegh, Bardsiri, Amid Khatibi, Shahbazzadeh, Mehdi Jafari

PERCORE: A Deep Learning-Based Framework for Persian Spelling Correction with Phonetic Analysis

arXiv.org Artificial Intelligence Jul-20-2024

This research introduces a state-of-the-art Persian spelling correction system that seamlessly integrates deep learning techniques with phonetic analysis, significantly enhancing the accuracy and efficiency of natural language processing (NLP) for Persian. Utilizing a fine-tuned language representation model, our methodology effectively combines deep contextual analysis with phonetic insights, adeptly correcting both non-word and real-word spelling errors. This strategy proves particularly effective in tackling the unique complexities of Persian spelling, including its elaborate morphology and the challenge of homophony. A thorough evaluation on a wide-ranging dataset confirms our system's superior performance compared to existing methods, with impressive F1-Scores of 0.890 for detecting real-word errors and 0.905 for correcting them. Additionally, the system demonstrates a strong capability in non-word error correction, achieving an F1-Score of 0.891. These results illustrate the significant benefits of incorporating phonetic insights into deep learning models for spelling correction. Our contributions not only advance Persian language processing by providing a versatile solution for a variety of NLP applications but also pave the way for future research in the field, emphasizing the critical role of phonetic analysis in developing effective spelling correction systems.

correction, error density, real-word error, (16 more...)

doi: 10.1007/s44196-024-00459-y

2407.14789

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Asia > Middle East > Iran > Kerman Province > Kerman (0.04)
Asia > East Asia (0.04)
(15 more...)

Genre:

Research Report > New Finding (0.93)
Overview (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zhu, Chloe Qinyu, Stureborg, Rickard, Fain, Brandon

Do Not Harm Protected Groups in Debiasing Language Representation Models

arXiv.org Artificial Intelligence Nov-11-2023

Language Representation Models (LRMs) trained with real-world data may capture and exacerbate undesired bias and cause unfair treatment of people in various demographic groups. Several techniques have been investigated for applying interventions to LRMs to remove bias in benchmark evaluations on, for example, word embeddings. However, the negative side effects of debiasing interventions are usually not revealed in the downstream tasks. We propose xGAP-DEBIAS, a set of evaluations for assessing the fairness of debiasing. In this work, we examine four debiasing techniques on a real-world text classification task and show that reducing bias comes at the cost of degrading performance for all demographic groups, including those the debiasing techniques aim to protect. We advocate that a debiasing technique should have good downstream performance with the constraint of ensuring no harm to the protected group.
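The "no harm to the protected group" constraint can be illustrated with a small per-group check. This is only a sketch under stated assumptions: the snippet does not define the xGAP-DEBIAS metrics, so the helper names `group_accuracy` and `harmed_groups`, the choice of accuracy as the metric, and the toy labels and predictions are all hypothetical.

```python
from collections import defaultdict


def group_accuracy(labels, preds, groups):
    """Return {group: accuracy} for parallel lists of labels, predictions and group ids."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p, g in zip(labels, preds, groups):
        total[g] += 1
        correct[g] += int(y == p)
    return {g: correct[g] / total[g] for g in total}


def harmed_groups(before, after, tolerance=0.0):
    """List groups whose accuracy dropped by more than `tolerance` after debiasing."""
    return sorted(g for g in before if after[g] < before[g] - tolerance)


# Toy data (made up): group "a" stands in for the protected group.
labels = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
preds_base = [1, 0, 1, 0, 0, 1, 0, 0]      # hypothetical baseline classifier
preds_debiased = [1, 0, 0, 0, 0, 1, 1, 1]  # hypothetical debiased classifier

acc_base = group_accuracy(labels, preds_base, groups)
acc_debiased = group_accuracy(labels, preds_debiased, groups)

# The accuracy gap between groups shrinks, but both groups lose accuracy.
gap_base = abs(acc_base["a"] - acc_base["b"])
gap_debiased = abs(acc_debiased["a"] - acc_debiased["b"])
harmed = harmed_groups(acc_base, acc_debiased)
```

In this toy case the gap between groups closes while every group, including the protected one, ends up worse off, which is the side effect the abstract warns about.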

computational linguistic, machine learning, natural language, (18 more...)

2310.18458

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Artificial Intelligence Jul-28-2022

MLRIP: Pre-training a military language representation model with informative factual knowledge and professional knowledge base

Li, Hui, Yang, Xuekang, Zhao, Xin, Yu, Lin, Zheng, Jiping, Sun, Wei

Incorporating prior knowledge into pre-trained language models has proven to be effective for knowledge-driven NLP tasks, such as entity typing and relation extraction. Current pre-training procedures usually inject external knowledge into models by using knowledge masking, knowledge fusion and knowledge replacement. However, the factual information contained in the input sentences has not been fully mined, and the external knowledge to be injected has not been strictly checked. As a result, the context information cannot be fully exploited, and either extra noise is introduced or the amount of knowledge injected is limited. To address these issues, we propose MLRIP, which modifies the knowledge masking strategies proposed by ERNIE-Baidu and introduces a two-stage entity replacement strategy. Extensive experiments with comprehensive analyses illustrate the superiority of MLRIP over BERT-based models in military knowledge-driven NLP tasks.

artificial intelligence, machine learning, natural language, (18 more...)

2207.13929

Country:

Asia > China > Jiangsu Province > Nanjing (0.05)
Asia > China > Chongqing Province > Chongqing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

#artificialintelligence Mar-14-2022, 12:59:41 GMT

Word2vec vs BERT

Both word2vec and BERT are popular recent NLP methods for generating vector representations of words, essentially replacing word-index dictionaries and one-hot encoded vectors as representations of text. Neither word-index nor one-hot encoding captures the semantic sense of language. One-hot encoding also becomes computationally infeasible when the vocabulary is large. Word2vec [1] is a neural network approach that learns distributed word vectors such that words used in similar syntactic or semantic contexts lie closer to each other in the distributed vector space.
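The contrast between one-hot vectors and distributed vectors can be seen in a few lines. This is a minimal sketch: the three-word vocabulary and the two-dimensional dense vectors are hand-picked for illustration and were not learned by word2vec or BERT.

```python
import math


def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


vocab = ["king", "queen", "apple"]
index = {word: i for i, word in enumerate(vocab)}

# One-hot vectors grow with the vocabulary and are mutually orthogonal,
# so every pair of distinct words looks equally unrelated.
sim_one_hot = cosine(one_hot(index["king"], len(vocab)),
                     one_hot(index["queen"], len(vocab)))

# Toy dense vectors (assumed values): related words point in similar directions.
dense = {
    "king": [0.9, 0.8],
    "queen": [0.8, 0.9],
    "apple": [-0.7, 0.1],
}
sim_related = cosine(dense["king"], dense["queen"])
sim_unrelated = cosine(dense["king"], dense["apple"])
```

With one-hot encoding the similarity between any two distinct words is exactly zero, while the dense vectors place "king" closer to "queen" than to "apple".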

bert, representation, vector, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.05)
Europe > Spain > Galicia > Madrid (0.05)
Europe > France (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

#artificialintelligence Dec-5-2021, 19:24:15 GMT

10 Must-read AI Papers

We have put together a list of the 10 most cited and discussed research papers in machine learning published over the past 10 years, from AlexNet to GPT-3. These are great reading for researchers new to the field and refreshers for experienced researchers. For each paper, we provide links to a short overview, author presentations and a detailed paper walkthrough for readers with different levels of expertise. Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art.
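For readers unfamiliar with the top-1 and top-5 error rates quoted in that abstract, the sketch below computes a top-k error on a toy example. The function name `top_k_error`, the three-class scores and the labels are illustrative assumptions, not data from the paper.

```python
def top_k_error(scores, labels, k):
    """Fraction of examples whose true label is not among the k highest-scoring classes."""
    errors = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        errors += int(label not in top_k)
    return errors / len(labels)


# Toy scores for 4 examples over 3 classes (made-up numbers).
scores = [
    [0.1, 0.6, 0.3],  # true class 1: ranked first
    [0.5, 0.2, 0.3],  # true class 2: ranked second
    [0.2, 0.7, 0.1],  # true class 2: ranked third
    [0.8, 0.1, 0.1],  # true class 0: ranked first
]
labels = [1, 2, 2, 0]

top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
```

A larger k can only lower the error, which is why a top-5 error rate is always at most the top-1 error rate on the same predictions.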

neural network, paper explanatory video, representation, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.16)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Mar-19-2021

BERTSurv: BERT-Based Survival Models for Predicting Outcomes of Trauma Patients

Zhao, Yun, Hong, Qinghang, Zhang, Xinlu, Deng, Yu, Wang, Yuqing, Petzold, Linda

Survival analysis is a technique to predict the times of specific outcomes, and is widely used in predicting the outcomes for intensive care unit (ICU) trauma patients. Recently, deep learning models have drawn increasing attention in healthcare. However, there is a lack of deep learning methods that can model the relationship between measurements, clinical notes and mortality outcomes. In this paper we introduce BERTSurv, a deep learning survival framework which applies Bidirectional Encoder Representations from Transformers (BERT) as a language representation model on unstructured clinical notes, for mortality prediction and survival analysis. We also incorporate clinical measurements in BERTSurv. With binary cross-entropy (BCE) loss, BERTSurv can predict mortality as a binary outcome (mortality prediction). With partial log-likelihood (PLL) loss, BERTSurv predicts the probability of mortality as a time-to-event outcome (survival analysis). We apply BERTSurv to Medical Information Mart for Intensive Care III (MIMIC III) trauma patient data. For mortality prediction, BERTSurv obtained an area under the receiver operating characteristic curve (AUC-ROC) of 0.86, which is an improvement of 3.6% over a multilayer perceptron (MLP) baseline without notes. For survival analysis, BERTSurv achieved a concordance index (C-index) of 0.7. In addition, visualizations of BERT's attention heads help to extract patterns in clinical notes and improve model interpretability by showing how the model assigns weights to different inputs.
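The partial log-likelihood loss and the concordance index mentioned above can be sketched in plain Python. This is not BERTSurv's implementation, which the snippet does not show: it is the textbook Cox partial likelihood without tie correction, averaged over observed events, and the survival times, event flags and risk scores are invented for illustration.

```python
import math


def neg_partial_log_likelihood(times, events, risks):
    """Cox negative partial log-likelihood (no tie correction), averaged over events.

    times:  observed time for each patient
    events: 1 if the event (death) was observed, 0 if censored
    risks:  model risk scores; higher means earlier expected event
    """
    total = 0.0
    n_events = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [risks[j] for j in range(len(times)) if times[j] >= times[i]]
        total += risks[i] - math.log(sum(math.exp(r) for r in risk_set))
        n_events += 1
    return -total / n_events


def concordance_index(times, events, risks):
    """Share of comparable pairs in which the earlier event has the higher risk score."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy cohort (made up): the third patient is censored.
times = [2, 5, 7, 9]
events = [1, 1, 0, 1]
risks = [2.0, 0.3, 0.5, 0.2]

c_index = concordance_index(times, events, risks)
loss_good = neg_partial_log_likelihood(times, events, [2.0, 1.0, 0.5, 0.2])
loss_bad = neg_partial_log_likelihood(times, events, [0.2, 0.5, 1.0, 2.0])
```

Risk scores that rank patients in the order they actually experienced the event give a lower loss than scores in the reverse order, and a C-index of 1.0 would mean every comparable pair is ranked correctly.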

bertsurv, mortality prediction, survival analysis, (10 more...)

2103.10928

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Yolo County > Davis (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Mar-9-2021, 03:51:06 GMT

microsoft/AzureML-BERT

This repo contains end-to-end recipes to pretrain and finetune the BERT (Bidirectional Encoder Representations from Transformers) language representation model using Azure Machine Learning service. That implementation uses ONNX Runtime to accelerate training and it can be used in environments with GPU including Azure Machine Learning service. Details on using ONNX Runtime for training and accelerating training of Transformer models like BERT and GPT-2 are available in the blog at ONNX Runtime Training Technical Deep Dive. BERT is a language representation model that is distinguished by its capacity to effectively capture deep and subtle textual relationships in a corpus. In the original paper, the authors demonstrate that the BERT model could be easily adapted to build state-of-the-art models for a number of NLP tasks, including text classification, named entity recognition and question answering.

azure machine learning service, language representation model, representation model, (12 more...)

Genre: Research Report (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)

#artificialintelligence Nov-28-2020, 20:36:06 GMT

Data Science Papers for Spring 2020

Pain Points, Needs, and Design Opportunities: This paper is a study of how notebooks are used for data science. It covers several of the negative impacts of using notebooks for data science; deployment, setup, collaboration, and reliability are a few examples. Quantifying the Carbon Emissions of Machine Learning: Training a neural network can take a lot of computer processing power. This processing power comes at a cost to the environment.

machine learning, paper look, quantum computer, (9 more...)

Country: North America > United States > California > San Diego County > San Diego (0.08)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.42)

#artificialintelligence Nov-2-2020, 22:35:36 GMT

Researchers spot origins of stereotyping in AI language technologies

A team of researchers has identified a set of cultural stereotypes that are introduced into artificial intelligence models for language early in their development, a finding that adds to our understanding of the factors that influence results yielded by search engines and other AI-driven tools. "Our work identifies stereotypes about people that widely used AI language models pick up as they learn English. The models we're looking at, and others like them for other languages, are the building blocks of most modern language technologies, from translation systems to question-answering personal assistants to industry tools for resume screening, highlighting the real danger posed by the use of these technologies in their current state," says Sam Bowman, an assistant professor at NYU's Department of Linguistics and Center for Data Science and the paper's senior author. "We expect this effort and related projects will encourage future research towards building more fair language processing systems." The work dovetails with recent scholarship, such as Safiya Umoja Noble's "Algorithms of Oppression: How Search Engines Reinforce Racism" (NYU Press, 2018), which chronicles how racial and other biases have plagued widely used language technologies.

artificial intelligence, natural language, stereotype, (14 more...)

Industry: Law > Civil Rights & Constitutional Law (0.36)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)